class: center, middle, inverse, title-slide

.title[
# Statistical Programming in R: Lecture 1
]
.subtitle[
## Introduction to R and RStudio
]
.author[
### Josemari Feliciano
]
.institute[
### DATA 412/612 - American University
]
.date[
### Fall 2024 - August 27
]

---

# Today's Agenda

1. Go over the syllabus
2. Introduction to R and RStudio
3. Installing R and RStudio
4. Basic R syntax and programming
5. Hands-on practice

---
class: center, middle

## Go over the syllabus

### Let us switch to Canvas, where a copy of the course syllabus is located.

---

## Introduction to R and RStudio

__What is R?__

- R is the open-source statistical language that seems to have taken over the world of statistics and data science. R is really more than a statistical package: it is a language and environment designed for statistical analysis and the production of high-quality graphics.

- It was originally developed by two statisticians at the University of Auckland as a dialect of the S statistical language.

- R is both open-source and open development. For more information, see www.r-project.org/contributors.html

---

## Introduction to R and RStudio

__Why learn R?__

- R is a powerful, flexible, and free (open-source) language designed specifically for statistical computing.

- There is an extensive collection of packages created by R users to extend R and implement modern statistical techniques.

- Furthermore, R is an interpreted, high-level language, which means that we can write code and run it in real time, line by line, without needing to worry about low-level programming details such as memory management.

---
class: center, middle

<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#images/lecture1rscreenshot.png" alt="Figure 1. A screenshot of what R looks like on macOS." width="100%" />
<p class="caption">Figure 1. A screenshot of what R looks like on macOS.
</p>
</div>

---

## Introduction to R and RStudio

__What is RStudio?__

- RStudio is an integrated development environment<sup>1</sup>, or IDE, for R programming, which you can download from https://posit.co/download/rstudio-desktop/.

- RStudio helps R users work more effectively by making common tasks easier. __One example is on the next slide.__

- RStudio is updated a couple of times a year, and it will automatically let you know when a new version is out, so there’s no need to check back. It’s a good idea to upgrade regularly to take advantage of the latest and greatest features.

.footnote[
[1] IDEs are tools designed to increase programmer productivity by combining common activities of writing software into a single application: editing source code, building executables, and debugging.
]

---
class: middle

<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#images/lecture1environmentpane.png" alt="Figure 2. This is the Environment pane in RStudio, which allows users to track which variables or data have been saved into the R environment." width="100%" />
<p class="caption">Figure 2. This is the Environment pane in RStudio, which allows users to track which variables or data have been saved into the R environment.</p>
</div>

---
class: center, middle

## My personal experience with R

### Some examples of past work that leveraged both R and RStudio

---
class: center, middle

## Installing R and RStudio

### Let us switch to Canvas, where a copy of the installation instructions is located.

### We will spend up to 10 minutes to ensure both R and RStudio are installed on your computer.

---
class: center, middle

# Basic R syntax and programming

---
class: center, middle

<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#images/lecture1consolepane.png" alt="Figure 3. This is the Console pane in RStudio (Linux version), which allows users to type in and execute scripts." width="100%" />
<p class="caption">Figure 3.
This is the Console pane in RStudio (Linux version), which allows users to type in and execute scripts.</p>
</div>

---
class: center, middle

<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#images/Lecture1Typing.png" alt="Figure 4. Here is an example of a simple script for addition. Type '4+2', then press Enter/Return." width="100%" />
<p class="caption">Figure 4. Here is an example of a simple script for addition. Type '4+2', then press Enter/Return.</p>
</div>

---
class: center, middle

<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#images/lecture1operation.png" alt="Figure 5. Five basic arithmetic operators you can use in R." width="100%" />
<p class="caption">Figure 5. Five basic arithmetic operators you can use in R.</p>
</div>

__Note:__ `+` is addition; `-` is subtraction; `*` is multiplication; `/` is division; and `**` or `^` is exponentiation.

---

<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#images/lecture1logic.png" alt="Figure 6. Four basic logical operators you can use in R." width="70%" />
<p class="caption">Figure 6. Four basic logical operators you can use in R.</p>
</div>

For now, here are some trivial examples to show how logical operators work:

- In `4 > 2`, you are asking R to evaluate: is the left-hand value greater than the right-hand value? In this case, the answer is TRUE.

- In `4 < 2`, you are asking R to evaluate: is the left-hand value less than the right-hand value? In this case, the answer is FALSE.

- In `4 == 2`, you are asking R to evaluate: is the left-hand value equal to the right-hand value? In this case, the answer is FALSE.

- In `4 != 2`, you are asking R to evaluate: is the left-hand value NOT equal to the right-hand value? In this case, the answer is TRUE.

---
class: center, middle

<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#images/lecture1previewdplyrfilter.png" alt="Figure 7.
Preview of what is to come later in the course. This is an excerpt from a guest lecture I gave to show you an example of the '==' logical operator. You will later learn the dplyr package." width="70%" />
<p class="caption">Figure 7. Preview of what is to come later in the course. This is an excerpt from a guest lecture I gave to show you an example of the '==' logical operator. You will later learn the dplyr package.</p>
</div>

---

# Console Pane

- In general, we will rarely type in and execute scripts from the console pane. Generally, you want to save the scripts you write in an R file and execute them from there (more on this later).

- Today is one of the exceptions. Over the next few slides, we will install packages that you will need in the course.

---

# The tidyverse

- An R package is a collection of functions, data, and documentation that extends the capabilities of base R.

- Using packages is key to the successful use of R. The majority of the packages that you will learn in this course are part of the so-called tidyverse.

- All packages in the tidyverse share a common philosophy of data and R programming and are designed to work together.

---

# Install the tidyverse packages

Execute this code within your console pane:

`install.packages("tidyverse")`

You only need to install this once. If you've used R previously, you may already have it.

Once you have tidyverse installed, you need to load the package each time you start a new R session.

---

# Loading packages in R

Note: No quotation marks are needed when loading a package.

Again, you need to load the package each time you start a new R session.

``` r
library(tidyverse)
```

```
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
```

---

# Source pane: missing

<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#images/lecture1threepanes.png" alt="Figure 8. Three panes you see upon opening RStudio. Initially, a fourth pane, called the source pane, is not shown." width="70%" />
<p class="caption">Figure 8. Three panes you see upon opening RStudio. Initially, a fourth pane, called the source pane, is not shown.</p>
</div>

---

# Source pane: Creating and saving R Scripts

<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#images/lecture1openingfourthpane.png" alt="Figure 9. The fourth pane (source pane) will appear when you create a new file called R Script (or load an existing R file)." width="60%" />
<p class="caption">Figure 9. The fourth pane (source pane) will appear when you create a new file called R Script (or load an existing R file).</p>
</div>

__For practice:__ Click File > New File > R Script. Within the file, type in one of the expressions you learned (e.g., one that uses one of the five basic arithmetic operators).

Once you are done: Click File > Save As. Name the file however you want and save it in a location that you can remember. Close RStudio, then try double-clicking the file from the location where you saved it.

---

<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#images/lecture1rscript.png" alt="Figure 10.
What you will see once you have loaded a saved file and run the script within it." width="90%" />
<p class="caption">Figure 10. What you will see once you have loaded a saved file and run the script within it.</p>
</div>

Note: To run a script from an R Script file, click anywhere on line 1 (or highlight the code you want to run), and press the 'Run' button in the upper-right corner of the source pane.

---
class: center, middle

## Hands-on Practice

### Download Exercise1.R from Canvas, then follow the demo provided in class using your computer.

---
class: center, middle

## Homework will be posted on Canvas on August 28. It will be due on September 3 at 11:59 pm.
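
---

# Recap: operators to try at home

The operators from this lecture can be sketched as a short script. This is a minimal recap, assuming you type each line into the console pane or run it from an R Script. The values in the comments follow directly from the operator definitions given earlier in the lecture.

```r
# Arithmetic operators
4 + 2    # addition: 6
4 - 2    # subtraction: 2
4 * 2    # multiplication: 8
4 / 2    # division: 2
4 ^ 2    # exponentiation: 16 (4 ** 2 gives the same result)

# Comparison operators return TRUE or FALSE
4 > 2    # is 4 greater than 2? TRUE
4 < 2    # is 4 less than 2? FALSE
4 == 2   # is 4 equal to 2? FALSE
4 != 2   # is 4 NOT equal to 2? TRUE
```

Saving these lines in an R Script and pressing 'Run' on each one is a quick way to practice the source pane workflow.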